-
1.
De novo design and Rosetta-based assessment of high-affinity antibody variable regions (Fv) against the SARS-CoV-2 spike receptor binding domain (RBD).
Boorla, VS, Chowdhury, R, Ramasubramanian, R, Ameglio, B, Frick, R, Gray, JJ, Maranas, CD
Proteins. 2023;(2):196-208
Abstract
The continued emergence of new SARS-CoV-2 variants has accentuated the growing need for fast and reliable methods for the design of potentially neutralizing antibodies (Abs) to counter immune evasion by the virus. Here, we report on the de novo computational design of high-affinity Ab variable regions (Fv) through the recombination of VDJ genes targeting the most solvent-exposed hACE2-binding residues of the SARS-CoV-2 spike receptor binding domain (RBD) protein using the software tool OptMAVEn-2.0. Subsequently, we carried out computational affinity maturation of the designed variable regions through amino acid substitutions for improved binding with the target epitope. Immunogenicity of designs was restricted by preferring designs that match sequences from a 9-mer library of "human Abs" based on a human string content score. We generated 106 different antibody designs and reported in detail on the top five that trade off the greatest computational binding affinity for the RBD with human string content scores. We further describe computational evaluation of the top five designs produced by OptMAVEn-2.0 using a Rosetta-based approach. We used Rosetta SnugDock for local docking of the designs to evaluate their potential to bind the spike RBD and performed "forward folding" with DeepAb to assess their potential to fold into the designed structures. Ultimately, our results identified one designed Ab variable region, P1.D1, as a particularly promising candidate for experimental testing. This effort puts forth a computational workflow for the de novo design and evaluation of Abs that can quickly be adapted to target spike epitopes of emerging SARS-CoV-2 variants or other antigenic targets.
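The abstract mentions scoring designs against a 9-mer library of human antibody sequences but does not define the human string content score. The sketch below shows one plausible reading, assumed here for illustration only: each 9-residue window of a design is compared with every library 9-mer, the best per-window identity is kept, and the results are averaged. The function names and the toy library are hypothetical and are not taken from OptMAVEn-2.0.

```python
def kmers(seq, k=9):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def human_string_content(query, human_kmers, k=9):
    """Average best-match identity of each query k-mer against a human k-mer library.

    Assumed definition for illustration; the published score may differ.
    Returns a value in [0, 1], where 1.0 means every window of the query
    is found verbatim in the library.
    """
    windows = kmers(query, k)
    if not windows or not human_kmers:
        raise ValueError("query must have at least k residues and the library must be non-empty")
    total_matches = 0
    for window in windows:
        # Best positional identity between this window and any library k-mer.
        best = max(sum(a == b for a, b in zip(window, h)) for h in human_kmers)
        total_matches += best
    return total_matches / (k * len(windows))


# Toy library built from a single hypothetical framework fragment.
toy_library = kmers("EVQLVESGGGLVQPGG")
score_identical = human_string_content("EVQLVESGGGLVQPGG", toy_library)
score_mutated = human_string_content("EVQLVESGA", ["EVQLVESGG"])
```

A higher score under this reading means a design is built from fragments that already occur in human antibodies, which is the property the abstract uses as a proxy for low immunogenicity.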
-
2.
Contextual protein and antibody encodings from equivariant graph transformers.
Mahajan, SP, Ruffolo, JA, Gray, JJ
bioRxiv : the preprint server for biology. 2023
Abstract
The optimal residue identity at each position in a protein is determined by its structural, evolutionary, and functional context. We seek to learn the representation space of the optimal amino-acid residue in different structural contexts in proteins. Inspired by masked language modeling (MLM), our training aims to transduce learning of amino-acid labels from non-masked residues to masked residues in their structural environments and from general (e.g., a residue in a protein) to specific contexts (e.g., a residue at the interface of a protein or antibody complex). Our results on native sequence recovery and forward folding with AlphaFold2 suggest that the amino acid label for a protein residue may be determined from its structural context alone (i.e., without knowledge of the sequence labels of surrounding residues). We further find that the sequence space sampled from our masked models recapitulates the evolutionary sequence neighborhood of the wildtype sequence. Remarkably, the sequences conditioned on highly plastic structures recapitulate the conformational flexibility encoded in the structures. Furthermore, maximum-likelihood interfaces designed with masked models recapitulate wildtype binding energies for a wide range of protein interfaces and binding strengths. We also propose and compare fine-tuning strategies to train models for designing CDR loops of antibodies in the structural context of the antibody-antigen interface by leveraging structural databases for proteins, antibodies (synthetic and experimental) and protein-protein complexes. We show that pretraining on more general contexts improves native sequence recovery for antibody CDR loops, especially for the hypervariable CDR H3, while fine-tuning helps to preserve patterns observed in special contexts.
-
3.
Hallucinating structure-conditioned antibody libraries for target-specific binders.
Mahajan, SP, Ruffolo, JA, Frick, R, Gray, JJ
Frontiers in immunology. 2022;:999034
Abstract
Antibodies are widely developed and used as therapeutics to treat cancer, infectious disease, and inflammation. During development, initial leads routinely undergo additional engineering to increase their target affinity. Experimental methods for affinity maturation are expensive, laborious, and time-consuming and rarely allow the efficient exploration of the relevant design space. Deep learning (DL) models are transforming the field of protein engineering and design. While several DL-based protein design methods have shown promise, the antibody design problem is distinct, and specialized models for antibody design are desirable. Inspired by hallucination frameworks that leverage accurate structure prediction DL models, we propose the FvHallucinator for designing antibody sequences, especially the CDR loops, conditioned on an antibody structure. Such a strategy generates targeted CDR libraries that retain the conformation of the binder and thereby the mode of binding to the epitope on the antigen. On a benchmark set of 60 antibodies, FvHallucinator generates sequences resembling natural CDRs and recapitulates perplexity of canonical CDR clusters. Furthermore, the FvHallucinator designs amino acid substitutions at the VH-VL interface that are enriched in human antibody repertoires and therapeutic antibodies. We propose a pipeline that screens FvHallucinator designs to obtain a library enriched in binders for an antigen of interest. We apply this pipeline to the CDR H3 of the Trastuzumab-HER2 complex to generate in silico designs predicted to improve upon the binding affinity and interfacial properties of the original antibody. Thus, the FvHallucinator pipeline enables generation of inexpensive, diverse, and targeted antibody libraries enriched in binders for antibody affinity maturation.
-
4.
A humanized yeast system to analyze cleavage of prelamin A by ZMPSTE24.
Spear, ED, Alford, RF, Babatz, TD, Wood, KM, Mossberg, OW, Odinammadu, K, Shilagardi, K, Gray, JJ, Michaelis, S
Methods (San Diego, Calif.). 2019;:47-55
Abstract
The nuclear lamins A, B, and C are intermediate filament proteins that form a nuclear scaffold adjacent to the inner nuclear membrane in higher eukaryotes, providing structural support for the nucleus. In the past two decades it has become evident that the final step in the biogenesis of the mature lamin A from its precursor prelamin A by the zinc metalloprotease ZMPSTE24 plays a critical role in human health. Defects in prelamin A processing by ZMPSTE24 result in premature aging disorders including Hutchinson Gilford Progeria Syndrome (HGPS) and related progeroid diseases. Additional evidence suggests that defects in prelamin A processing, due to diminished ZMPSTE24 expression or activity, may also drive normal physiological aging. Because of the important connection between prelamin A processing and human aging, there is increasing interest in how ZMPSTE24 specifically recognizes and cleaves its substrate prelamin A, encoded by LMNA. Here, we describe two humanized yeast systems we have recently developed to examine ZMPSTE24 processing of prelamin A. These systems differ from one another slightly. Version 1.0 is optimized to analyze ZMPSTE24 mutations, including disease alleles that may affect the function or stability of the protease. Using this system, we previously showed that some ZMPSTE24 disease alleles that affect stability can be rescued by the proteasome inhibitor bortezomib, which may have therapeutic implications. Version 2.0 is designed to analyze LMNA mutations at or near the ZMPSTE24 processing site to assess whether they permit or impede prelamin A processing. Together these systems offer powerful methodology to study ZMPSTE24 disease alleles and to dissect the specific residues and features of the lamin A tail that are required for recognition and cleavage by the ZMPSTE24 protease.
-
5.
Viral infection causes a shift in the self peptide repertoire presented by human MHC class I molecules.
Spencer, CT, Bezbradica, JS, Ramos, MG, Arico, CD, Conant, SB, Gilchuk, P, Gray, JJ, Zheng, M, Niu, X, Hildebrand, W, et al
Proteomics. Clinical applications. 2015;(11-12):1035-52
Abstract
PURPOSE: MHC class I presentation of peptides allows T cells to survey the cytoplasmic protein milieu of host cells. During infection, presentation of self peptides is, in part, replaced by presentation of microbial peptides. However, little is known about the self peptides presented during infection, despite the fact that microbial infections alter host cell gene expression patterns and protein metabolism. EXPERIMENTAL DESIGN: The self peptide repertoire presented by HLA-A*01:01, HLA-A*02:01, HLA-B*07:02, HLA-B*35:01, and HLA-B*45:01 (where HLA is human leukocyte antigen) was determined by tandem MS before and after vaccinia virus infection. RESULTS: We observed a profound alteration in the self peptide repertoire with hundreds of self peptides uniquely presented after infection for which we have coined the term "self peptidome shift." The fraction of novel self peptides presented following infection varied for different HLA class I molecules. A large part (approximately 40%) of the self peptidome shift arose from peptides derived from type I interferon-inducible genes, consistent with cellular responses to viral infection. Interestingly, approximately 12% of self peptides presented after infection showed allelic variation when searched against approximately 300 human genomes. CONCLUSION AND CLINICAL RELEVANCE: Self peptidome shift in a clinical transplant setting could result in alloreactivity by presenting new self peptides in the context of infection-induced inflammation.
-
6.
Targeted DNA methylation using an artificially bisected M.HhaI fused to zinc fingers.
Chaikind, B, Kilambi, KP, Gray, JJ, Ostermeier, M
PloS one. 2012;(9):e44852
Abstract
Little is known about the effects of single DNA methylation events on gene transcription. The ability to direct the methylation toward a single unique site within a genome would have broad use as a tool to study the effects of specific epigenetic changes on transcription. A targeted enzyme might also be useful in a therapy for diseases with an epigenetic component or as a means to site-specifically label DNA. Previous studies have sought to target methyltransferase activity by fusing DNA binding proteins to methyltransferases. However, the methyltransferase domain remains active even when the DNA binding protein is unbound, resulting in significant off-target methylation. A better strategy would make methyltransferase activity contingent upon the DNA binding protein's association with its DNA binding site. We have designed targeted methyltransferases by fusing zinc fingers to the fragments of artificially-bisected, assembly-compromised methyltransferases. The zinc fingers' binding sites flank the desired target site for methylation. Zinc finger binding localizes the two fragments near each other encouraging their assembly only over the desired site. Through a combination of molecular modeling and experimental optimization in E. coli, we created an engineered methyltransferase derived from M.HhaI with 50-60% methylation at a target site and nearly undetectable levels of methylation at a non-target M.HhaI site (1.4 ± 2.4%). Using a restriction digestion assay, we demonstrate that localization of both fragments synergistically increases methylation at the target site, illustrating the promise of our approach.
-
7.
Prediction of calcite morphology from computational and experimental studies of mutations of a de novo-designed peptide.
Schrier, SB, Sayeg, MK, Gray, JJ
Langmuir : the ACS journal of surfaces and colloids. 2011;(18):11520-7
Abstract
Many organisms use macromolecules, often proteins or peptides, to control the growth of inorganic crystals into complex materials. The ability to model peptide-mineral interactions accurately could allow for the design of novel peptides to produce materials with desired properties. Here, we tested a computational algorithm developed to predict the structure of peptides on mineral surfaces. Using this algorithm, we analyzed energetic and structural differences between a 16-residue peptide (bap4) designed to interact with a calcite growth plane and single- and double-point mutations of the charged residues. Currently, no experimental method is available to resolve the structures of proteins on solid surfaces, which precludes benchmarking for computational models. Therefore, to test the models, we chemically synthesized each peptide and analyzed its effects on calcite crystal growth. Whereas bap4 affected the crystal growth by producing heavily stepped corners and edges, point mutants had variable influences on morphology. Calculated residue-specific binding energies correlated with experimental observations; point mutations of residues predicted to be crucial to surface interactions produced morphologies most similar to unmodified calcite. These results suggest that peptide conformation plays a role in mineral interactions and that the computational model supplies valid energetic and structural data that can provide information about expected crystal morphology.
-
8.
Structurally distinct toxicity inhibitors bind at common loci on β-amyloid fibril.
Keshet, B, Gray, JJ, Good, TA
Protein science : a publication of the Protein Society. 2010;(12):2291-304
Abstract
The accumulation of aggregated β-amyloid (Aβ) in the brain is a hallmark of Alzheimer's disease and is thought to play a role in the neurotoxicity associated with the disease. The mechanism by which Aβ aggregates induce toxicity is uncertain. Nonetheless, several small molecules have been found to interact with Aβ fibrils and to prevent their toxicity. In this paper we studied the binding of these known toxicity inhibitors to Aβ fibrils, as a means to explore surfaces or loci on Aβ aggregates that may be significant in the mechanism of action of these inhibitors. We believe knowledge of these binding loci will provide insight into surfaces on the Aβ fibrils important in Aβ biological activity. The program DOCK was used to computationally dock the inhibitors to an Aβ fibril. The inhibitors docked at two shared binding loci, near Lys28 and at the C-termini near Asn27 and Val39. The docking predictions were experimentally verified using lysine-specific chemical modifications and Aβ fibrils mutated at Asn27. We found that both Congo red and Myricetin, despite being structurally different, bound at the same two sites. Additionally, our data suggest that three additional Aβ toxicity inhibitors may also bind in one of the sites. Identification of these common binding loci provides targets on the Aβ fibril surface that can be tested in the future for their role in Aβ biological activity.
-
9.
Applying linear interaction energy method for rational design of noncompetitive allosteric inhibitors of the sarco- and endoplasmic reticulum calcium-ATPase.
Singh, P, Mhaka, AM, Christensen, SB, Gray, JJ, Denmeade, SR, Isaacs, JT
Journal of medicinal chemistry. 2005;(8):3005-14
Abstract
Noncompetitive inhibitors of sarco- and endoplasmic reticulum calcium-ATPase (SERCA) have important therapeutic value in the treatment of cancer, due to their ability to induce apoptosis in cancer cells in a proliferation-independent manner. Thapsigargin (TG) and its analogues are one such class of inhibitors that bind to a hydrophobic pocket located in the transmembrane region of SERCA near the biomembrane surface and interfere with calcium transport. The binding free energies of thapsigargin-based inhibitors of SERCA were computed using a novel linear interaction energy (LIE) method with a surface generalized Born (SGB) continuum solvation model. A training set of 20 TG analogues was used to build a binding affinity model for estimating the free energy of binding for 18 new inhibitors with a root-mean-square (rms) error of 1.36 kcal/mol with respect to experimental data. For 15 out of the 18 inhibitors in the test set, the rms error was 1.02 kcal/mol, which is on the order of the accuracy level achieved by highly rigorous free energy of perturbation (FEP) or thermodynamic integration (TI) methods. On the basis of the analysis of the binding cavity at the interface of the membrane surface and the cytoplasmic region, we propose that side chains of TG derivatives at the O-8 position orient toward the cytoplasmic region through a hydrophobic channel. On the basis of this insight, four analogues of varying side chain length at the O-8 position with a charged moiety at the end were designed, tested with LIE methodology, and then validated experimentally for their SERCA inhibition activity. Low levels of rms error for the majority of inhibitors establish the structure-based LIE method as an efficient tool for generating more potent and specific inhibitors of SERCA by testing rationally designed lead compounds based on thapsigargin derivatization.
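The linear interaction energy idea in the abstract above can be sketched as a weighted sum of energy differences between the bound and free states of the ligand, with the quality of the model reported as a root-mean-square error against experiment. The three-term form (van der Waals, electrostatic, cavity) and all numbers below are assumptions for illustration; the abstract gives neither the fitted coefficients nor the exact functional form used with the SGB solvation model.

```python
import math


def lie_binding_energy(d_vdw, d_elec, d_cav, alpha, beta, gamma):
    """Estimate a binding free energy (kcal/mol) as a linear combination.

    d_vdw, d_elec, d_cav are assumed to be (bound minus free) differences in
    the average van der Waals, electrostatic, and cavity energies of the
    ligand. alpha, beta, gamma are empirical coefficients that would be
    fitted on a training set; the values used below are hypothetical.
    """
    return alpha * d_vdw + beta * d_elec + gamma * d_cav


def rms_error(predicted, experimental):
    """Root-mean-square error between predicted and experimental values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical energy differences and coefficients, not values from the paper.
example_dg = lie_binding_energy(d_vdw=-10.0, d_elec=-20.0, d_cav=2.0,
                                alpha=0.5, beta=0.1, gamma=1.0)
example_rms = rms_error([1.0, 2.0], [1.0, 4.0])
```

In practice the coefficients would be obtained by regression on the training analogues and then applied unchanged to the test inhibitors, which is how an rms error on held-out compounds such as the one quoted in the abstract would be computed.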
-
10.
Protein-protein docking predictions for the CAPRI experiment.
Gray, JJ, Moughon, SE, Kortemme, T, Schueler-Furman, O, Misura, KM, Morozov, AV, Baker, D
Proteins. 2003;(1):118-22
Abstract
We predicted structures for all seven targets in the CAPRI experiment using a new method in development at the time of the challenge. The technique includes a low-resolution rigid body Monte Carlo search followed by high-resolution refinement with side-chain conformational changes and rigid body minimization. Decoys (approximately 10^6 per target) were discriminated using a scoring function including van der Waals and solvation interactions, hydrogen bonding, residue-residue pair statistics, and rotamer probabilities. Decoys were ranked, clustered, manually inspected, and selected. The top ranked model for target 6 predicted the experimental structure to 1.5 Å RMSD and included 48 of 65 correct residue-residue contacts. Target 7 was predicted at 5.3 Å RMSD with 22 of 37 correct residue-residue contacts using a homology model from a known complex structure. Using a preliminary version of the protocol in round 1, target 1 was predicted within 8.8 Å although few contacts were correct. For targets 2 and 3, the interface locations and a small fraction of the contacts were correctly identified.
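The two accuracy measures quoted in the abstract above, RMSD and the number of correct residue-residue contacts, can be illustrated with a minimal sketch. It assumes the predicted and experimental coordinates are already in a common frame (no superposition is performed) and that a contact is a pair of residue identifiers, one from each partner. The abstract does not state the exact conventions used in CAPRI, so the function names and toy inputs are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes both structures are already superimposed in the same frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


def contact_recovery(predicted_contacts, native_contacts):
    """Fraction of native residue-residue contacts present in the prediction.

    Each contact is a (residue_on_partner_1, residue_on_partner_2) tuple.
    """
    native = set(native_contacts)
    if not native:
        raise ValueError("native contact set must be non-empty")
    return len(native & set(predicted_contacts)) / len(native)


# Toy inputs: a model shifted by 3 units along z, and one of two contacts recovered.
toy_rmsd = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)])
toy_recovery = contact_recovery({(1, 2), (3, 4)}, {(1, 2), (5, 6)})
```

Under this reading, the target 6 result of 48 of 65 correct contacts corresponds to a recovery fraction of about 0.74.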